[RFC] Support dynamically loading LTTng tracing #1621
Currently static compilation with LTTng tracing enabled fails with
the following errors:
In file included from /home/rdma-core/providers/rxe/rxe_trace.c:9:
/rdma-core/providers/rxe/rxe_trace.h:12:38: fatal error: rxe_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "rxe_trace.h"
| ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/rxe/CMakeFiles/rxe.dir/build.make:76: providers/rxe/CMakeFiles/rxe.dir/rxe_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/efa/efa_trace.c:9:
/home/rdma-core/providers/efa/efa_trace.h:12:38: fatal error: efa_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "efa_trace.h"
| ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/efa/CMakeFiles/efa-static.dir/build.make:76: providers/efa/CMakeFiles/efa-static.dir/efa_trace.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3085: providers/efa/CMakeFiles/efa-static.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/mlx5/mlx5_trace.c:9:
/home/rdma-core/providers/mlx5/mlx5_trace.h:12:38: fatal error: mlx5_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "mlx5_trace.h"
| ^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/mlx5/CMakeFiles/mlx5-static.dir/build.make:76: providers/mlx5/CMakeFiles/mlx5-static.dir/mlx5_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/hns/hns_roce_u_trace.c:9:
/home/rdma-core/providers/hns/hns_roce_u_trace.h:12:38: fatal error: hns_roce_u_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
Fix it by linking the library and including drivers' directories for
static compilation.
Fixes: 382b359 ("efa: Add support for LTTng tracing")
Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Create extra provider libraries for tracing so that the regular libraries do not need to depend on LTTng. For example, there will be a new libhns_trace-rdmav*.so for hns tracing.

Usage example:

    $ lttng create my_session
    $ lttng enable-event -u rdma_core_hns:post_send
    $ lttng start
    $ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0
    $ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0 10.10.10.10
    $ lttng stop
    $ lttng view

No additional dependencies or performance penalty are introduced unless users explicitly load the tracing library as shown above.

This change involves all providers that support LTTng tracing, including efa, hns, mlx5 and rxe.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
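To illustrate the opt-in idea behind the LD_PRELOAD usage, here is a minimal sketch of one generic way a library can stay free of a tracing dependency until an extra shared object is loaded: a weak symbol reference on GCC/Clang with ELF. This is an assumption for illustration only and not necessarily how the patch or LTTng-UST implements it; `demo_trace_post_send` and `demo_provider_post_send` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Weak reference (GCC/Clang, ELF): if no loaded object defines
 * demo_trace_post_send, its address is NULL and the call is skipped.
 * A preloaded tracing library could supply the definition.
 * Hypothetical name, not part of rdma-core.
 */
__attribute__((weak)) void demo_trace_post_send(const char *dev_name,
						unsigned int wr_count);

/* Hypothetical provider entry point with one optional trace hook. */
int demo_provider_post_send(const char *dev_name, unsigned int wr_count)
{
	assert(dev_name != NULL);

	/* Only call into the tracing code when it has been loaded. */
	if (demo_trace_post_send)
		demo_trace_post_send(dev_name, wr_count);

	return 0;
}
```

With this pattern the regular library has no link-time dependency on the tracing code; the cost when tracing is not loaded is a single pointer check per call.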
Define rdma_tracepoint() in the common trace.h to remove the duplicate definitions in the drivers.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
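A minimal sketch of what a shared `rdma_tracepoint()` wrapper could look like, assuming a build-time switch named `LTTNG_ENABLED`. The `demo_*` functions and the `rdma_core_demo` provider are hypothetical and only exist to exercise the macro; the actual definition in the patch may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: LTTNG_ENABLED is set by the build system when LTTng-UST is
 * available. In that case each driver must also include its own provider
 * header (not shown) that declares the events it emits.
 */
#ifdef LTTNG_ENABLED
#include <lttng/tracepoint.h>
#define rdma_tracepoint(...) lttng_ust_tracepoint(__VA_ARGS__)
#else
/* The preprocessor drops the arguments, so they are never evaluated. */
#define rdma_tracepoint(...) do { } while (0)
#endif

/* Counts how often the trace argument below is actually evaluated. */
int demo_arg_evaluations;

int demo_next_wr_id(void)
{
	return ++demo_arg_evaluations;
}

/* Hypothetical driver function that hits one trace point per call. */
int demo_post_send(const char *dev_name)
{
	assert(dev_name != NULL);
	rdma_tracepoint(rdma_core_demo, post_send, dev_name, demo_next_wr_id());
	return 0;
}
```

Keeping the wrapper in one header means every driver gets the same behaviour when tracing is compiled out, instead of each provider carrying its own copy of the fallback.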
Now that the tracing libraries have been separated from the regular provider libraries, enabling LTTng tracing by default has become feasible for release builds of rdma-core. Users can choose whether to install the tracing libraries according to their needs, which improves usability.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
And I still think that providing extra, special tracing libraries as part of rdma-core is the wrong approach.
I came here looking for tracing in rdma-core too, after recent involvement in debugging a RoCE setup. One issue we encountered was the way a number of the APIs pass back a NULL and no other error context on failure (errno from failed ioctls would have been desirable), making it difficult to triage many classes of failure.

Instead of LTTng, would it be possible to embed support for USDT style traces? The eBPF folk have recently provided a header-only solution - https://github.com/libbpf/usdt/blob/main/usdt.h - which I've had good success with in two other projects. It requires no additional runtime or build dependencies, as they encourage embedding that one header file directly into your project.

The USDT implementation involves insertion of noop instructions at instrumentation points, and some additional annotations in the library (on-disk), so it is essentially zero overhead when the traces are not in use.

In my case earlier, traces including the errno on API failure paths where NULL-return-indicates-failure would be very useful to have readily available.
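As a rough sketch of the failure-path idea described above, the following shows a NULL-returning call that reports errno through a USDT probe. It assumes the libbpf/usdt header has been copied into the tree as `usdt.h` and is enabled through a hypothetical `HAVE_USDT_H` flag; without the flag the probe compiles to nothing. The `demo_*` functions, the `rdma_core_demo` group and the `create_qp_failed` probe name are invented for illustration and are not rdma-core APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Assumption: HAVE_USDT_H is a hypothetical build flag meaning the
 * libbpf/usdt header is available as "usdt.h". Otherwise USDT() is a
 * no-op so this sketch still builds on its own.
 */
#ifdef HAVE_USDT_H
#include "usdt.h"
#else
#define USDT(group, name, ...) do { } while (0)
#endif

struct demo_qp {
	int handle;
};

/* Stand-in for an ioctl()-backed call: returns -1 and sets errno on failure. */
int demo_kernel_create_qp(int max_wr)
{
	if (max_wr <= 0) {
		errno = EINVAL;
		return -1;
	}
	return 42;
}

/* NULL-on-failure API that exposes the errno value through a probe. */
struct demo_qp *demo_create_qp(struct demo_qp *storage, int max_wr)
{
	int handle;

	assert(storage != NULL);

	handle = demo_kernel_create_qp(max_wr);
	if (handle < 0) {
		int saved_errno = errno;

		/* Probe fires only on the failure path. */
		USDT(rdma_core_demo, create_qp_failed, saved_errno);
		errno = saved_errno; /* keep errno intact for the caller */
		return NULL;
	}

	storage->handle = handle;
	return storage;
}
```

With a real build, a tool such as bpftrace could attach to the `rdma_core_demo:create_qp_failed` probe and print its first argument to see why a call returned NULL.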
We can support both; patches are welcome.
Default to providing lightweight USDT trace points when LTTng is unavailable. This piggybacks on the existing tracing code added for LTTng for a minimal set of changes.

> $ sudo bpftrace -l usdt:build/lib/lib*.so:*
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:post_recv
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:post_send
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:process_completion
> usdt:build/lib/libefa.so:rdma_core_efa:post_recv
> usdt:build/lib/libefa.so:rdma_core_efa:post_send
> usdt:build/lib/libefa.so:rdma_core_efa:process_completion
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:poll_cq
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:post_recv
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:post_send
> usdt:build/lib/libhns.so:rdma_core_hns:poll_cq
> usdt:build/lib/libhns.so:rdma_core_hns:post_recv
> usdt:build/lib/libhns.so:rdma_core_hns:post_send
> usdt:build/lib/libmlx5-rdmav59.so:rdma_core_mlx5:post_send
> usdt:build/lib/libmlx5.so:rdma_core_mlx5:post_send
> usdt:build/lib/librxe-rdmav59.so:rdma_core_rxe:post_send

The USDT header used here is from the libbpf/usdt project at https://github.com/libbpf/usdt.git

Further background discussion for this commit is included in linux-rdma#1621

Signed-off-by: Nathan Scott <[email protected]>
This PR is an RFC to support dynamically loading LTTng tracing.
As mentioned in our brief discussion previously in #1587, there are some benefits to supporting this:
We believe that this can greatly improve the usability of LTTng tracing.
BTW, the first patch is included incidentally and isn’t actually part of this feature. It’s meant to fix the static compilation failures when enabling LTTng.